-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] Implement atomic_memory_scope_capabilities device query for OpenCL and Level Zero #8595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… and Level Zero backends. Signed-off-by: Maronas, Marcos <[email protected]>
Signed-off-by: Maronas, Marcos <[email protected]>
Signed-off-by: Maronas, Marcos <[email protected]>
sycl/plugins/opencl/pi_opencl.cpp
Outdated
// Because scopes are hierarchical, wider scopes support all narrower | ||
// scopes. SUB_GROUP and WORK_ITEM were already included in the | ||
// initialization, since WORK_GROUP is mandated minimum capality. | ||
if (devCapabilities && PI_DEVICE_ATOMIC_SCOPE_DEVICE) { | ||
result |= PI_MEMORY_SCOPE_DEVICE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I suppose that's the answer to my question above about default value. Do you have a pointer to spec section which describes the hierarchical nature of those scopes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from OpenCL Spec: https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES
However, you can see there that WORK_ITEM
is actually an exception in OpenCL. I had an offline discussion with @gmlueck, @bashbaug and @Pennycook about this, and the conclusion is that SYCL should always return WORK_ITEM
. I think there will be an update in the spec to detail this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks! Perhaps it would be better to put that link to OCL spec directly into sources next to the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but the OCL spec explicitly says WORK_ITEM
is an exception, so that might be confusing because we are not treating it as an exception. When SYCL spec is updated, we can obviously add a ref to that next to the comment. For now, I'm not sure if referencing OCL spec is helpful or more confusing. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, this code is within OpenCL plugin and therefore I think it shouldn't be too confusing to leave links to OpenCL spec in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this offline. There is a corner case with WORK_ITEM
scope being defined differently between SYCL and OpenCL, which is a reason for ignoring this OpenCL exception.
I think that it would still make sense to leave an OCL spec link, but we should have a comment about why we ignore that work-item
exception. If we already have a link to SYCL spec clarification PR/issue, then even better, we can link it there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extended the current comment with further explanation and a link to OCL spec. Please, let me know if this is enough.
Signed-off-by: Maronas, Marcos <[email protected]>
Signed-off-by: Maronas, Marcos <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM
sycl/include/sycl/detail/cl.h
Outdated
@@ -11,7 +11,7 @@ | |||
// Suppress a compiler message about undefined CL_TARGET_OPENCL_VERSION | |||
// and define all symbols up to OpenCL 2.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// and define all symbols up to OpenCL 2.2 | |
// and define all symbols up to OpenCL 3.0 |
sycl/plugins/opencl/pi_opencl.cpp
Outdated
if (paramValueSizeRet) | ||
*paramValueSizeRet = sizeof(result); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (paramValueSizeRet) | |
*paramValueSizeRet = sizeof(result); | |
} | |
} | |
if (paramValueSizeRet) | |
*paramValueSizeRet = sizeof(result); | |
paramValueSizeRet
should be filled even if paramValue
is nullptr.
That is needed to support (usually) initial query:
piDeviceInfoGet(device, paramName, nullptr, 0, &size_ret);
// now allocate size_ret bytes and query result
sycl/plugins/opencl/pi_opencl.cpp
Outdated
@@ -1787,7 +1842,7 @@ pi_result piextKernelGetNativeHandle(pi_kernel kernel, | |||
// Windows: dynamically loaded plugins might have been unloaded already | |||
// when this is called. Sycl RT holds onto the PI plugin so it can be | |||
// called safely. But this is not transitive. If the PI plugin in turn | |||
// dynamically loaded a different DLL, that may have been unloaded. | |||
// dynamically loaded a different DLL, that may have been unloaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an unrelated change, better to avoid
sycl/plugins/opencl/pi_opencl.cpp
Outdated
// Because scopes are hierarchical, wider scopes support all narrower | ||
// scopes. SUB_GROUP and WORK_ITEM were already included in the | ||
// initialization, since WORK_GROUP is mandated minimum capality. | ||
if (devCapabilities && PI_DEVICE_ATOMIC_SCOPE_DEVICE) { | ||
result |= PI_MEMORY_SCOPE_DEVICE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, this code is within OpenCL plugin and therefore I think it shouldn't be too confusing to leave links to OpenCL spec in there.
@intel/dpcpp-esimd-reviewers, could you please take a look at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
esimd emu lgtm
Signed-off-by: Maronas, Marcos <[email protected]>
Signed-off-by: Maronas, Marcos <[email protected]>
Friendly ping @intel/llvm-reviewers-runtime @intel/dpcpp-l0-pi-reviewers |
case UR_DEVICE_INFO_ATOMIC_MEMORY_SCOPE_CAPABILITIES: { | ||
// There are no explicit restrictions in L0 programming guide, so assume all | ||
// are supported | ||
pi_memory_scope_capabilities result = | ||
PI_MEMORY_SCOPE_WORK_ITEM | PI_MEMORY_SCOPE_SUB_GROUP | | ||
PI_MEMORY_SCOPE_WORK_GROUP | PI_MEMORY_SCOPE_DEVICE | | ||
PI_MEMORY_SCOPE_SYSTEM; | ||
|
||
return ReturnValue(result); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jandres742 : please review this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assumption is correct for L0 GPU driver, based on what we return already in OpenCL GPU driver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Signed-off-by: Maronas, Marcos <[email protected]>
Signed-off-by: Maronas, Marcos <[email protected]>
@intel/llvm-reviewers-runtime assigned reviewer is on holiday, could you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🚀
Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14, |
…enCL and Level Zero (intel#8595) Adds support to query devices for `atomic_memory_scope_capabilities`. The backends supported are OpenCL and Level Zero. For the rest of backends, it has been left unsupported. --------- Signed-off-by: Maronas, Marcos <[email protected]>
Adds support to query devices for
atomic_memory_scope_capabilities
. The backends supported are OpenCL and Level Zero. For the rest of backends, it has been left unsupported.